The Burrows-Wheeler Transform: Theory and Practice

نویسنده

  • Giovanni Manzini
چکیده

In this paper we describe the Burrows-Wheeler Transform (BWT) a completely new approach to data compression which is the basis of some of the best compressors available today. Although it is easy to intuitively understand why the BWT helps compression, the analysis of BWT-based algorithms requires a careful study of every single algorithmic component. We describe two algorithms which use the BWT and we show that their compression ratio can be bounded in terms of the k-th order empirical entropy of the input string for any k ≥ 0. Intuitively, this means that these algorithms are able to make use of all the regularity which is in the input string. We also discuss some of the algorithmic issues which arise in the computation of the BWT, and we describe two variants of the BWT which promise interesting developments.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Universal Data Compression Based on the Burrows-Wheeler Transformation: Theory and Practice

ÐA very interesting recent development in data compression is the Burrows-Wheeler Transformation [1]. The idea is to permute the input sequence in such a way that characters with a similar context are grouped together. We provide a thorough analysis of the Burrows-Wheeler Transformation from an information theoretic point of view. Based on this analysis, the main part of the paper systematicall...

متن کامل

Output distribution of the Burrows - Wheeler transform ' Karthik

The Burrows-Wheeler transform is a block-sorting algorithm which has been shown empirically to be useful in compressing text data. In this paper we study the output distribution of the transform for i.i.d. sources, tree sources and stationary ergodic sources. We can also give analytic bounds on the performance of some universal compression schemes which use the Burrows-Wheeler transform.

متن کامل

Wheeler Graphs: Variations on a Theme by Burrows and Wheeler

The famous Burrows-Wheeler Transform was originally defined for single strings but variations have been developed for sets of strings, labelled trees, de Bruijn graphs, alignments, etc. In this talk we propose a unifying view that includes many of these variations and that we hope will simplify the search for more. Somewhat surprisingly we get our unifying view by considering the Nondeterminist...

متن کامل

FUNCTIONAL PEARLS Inverting the Burrows-Wheeler Transform

Our aim in this pearl is to exploit simple equational reasoning to derive the inverse of the Burrows-Wheeler transform from its specification. We also outline how to derive the inverse of two more general versions of the transform, one proposed by Schindler and the other by Chapin and Tate.

متن کامل

A four-stage algorithm for updating a Burrows-Wheeler transform

We present a four-stage algorithm that updates the Burrows-Wheeler Transform of a text T , when this text is modified. The Burrows-Wheeler Transform is used by many text compression applications and some self-index data structures. It operates by reordering the letters of a text T to obtain a new text bwt(T ) which can be better compressed. Even if recent advances are offering this structure ne...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 1999